# Ternary quantization compression
## MiniCPM4-8B-GGUF
openbmb · Apache-2.0
MiniCPM4 is an efficient large language model designed specifically for edge devices. While maintaining optimal performance at the same scale, it achieves extreme efficiency improvements, enabling over 5x generation acceleration on typical edge chips.
Tags: Large Language Model · Transformers · Supports Multiple Languages
## MiniCPM4-8B-Marlin-vLLM
openbmb · Apache-2.0
MiniCPM4 is an efficient large language model designed specifically for edge devices, achieving extreme efficiency improvements and optimal performance at the same scale.
Tags: Large Language Model · Transformers · Supports Multiple Languages
## MiniCPM4-0.5B
openbmb · Apache-2.0
MiniCPM4 is an efficient large language model designed specifically for edge devices. Through systematic innovation, it achieves extreme efficiency improvements in four key dimensions: model architecture, training data, training algorithm, and inference system.
Tags: Large Language Model · Transformers · Supports Multiple Languages
## MiniCPM4-8B
openbmb · Apache-2.0
MiniCPM4 is an efficient large language model designed specifically for edge devices. Through systematic innovation, it achieves extreme efficiency improvements in four dimensions: model architecture, training data, training algorithm, and inference system. It can achieve over 5x faster generation speed on edge chips.
Tags: Large Language Model · Transformers · Supports Multiple Languages
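The ternary quantization named in the title can be sketched as follows. This is a minimal illustrative example of per-tensor absmean ternary ("1.58-bit") quantization, in which each weight is mapped to one of three levels {-1, 0, +1} times a shared floating-point scale; it is not the actual kernel used by these GGUF/Marlin checkpoints, and both function names are hypothetical.

```python
def ternary_quantize(weights):
    """Sketch of absmean ternary quantization (hypothetical helper).

    Maps each weight to {-1, 0, +1} times one per-tensor scale.
    """
    # Absmean scale: mean absolute value of the tensor (guard against
    # an all-zero tensor to avoid division by zero).
    scale = sum(abs(w) for w in weights) / len(weights) or 1.0
    # Round each weight to the nearest ternary level, clipped to [-1, 1].
    q = [max(-1, min(1, round(w / scale))) for w in weights]
    return q, scale

def ternary_dequantize(q, scale):
    """Recover an approximation of the original weights."""
    return [v * scale for v in q]

# Example: four fp weights collapse to ternary codes plus one fp scale.
q, s = ternary_quantize([0.9, -0.05, 0.4, -1.2])
print(q)  # [1, 0, 1, -1], with scale ~= 0.6375
```

Storing two bits per weight plus one scale per tensor (or per block, as GGUF-style formats typically do) is what makes this kind of compression attractive for memory-bound edge inference.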